Abstract

In this paper we describe how to modify GSAT so that it can be applied to non-clausal 
formulas. The idea is to use a particular "score" function which gives the number of clauses 
of the CNF conversion of a formula which are false under a given truth assignment. Its 
value is computed in linear time, without constructing the CNF conversion itself. The 
proposed methodology applies to most of the variants of GSAT proposed so far. 

1. Introduction 

GSAT (Selman, Levesque, & Mitchell, 1992; Selman & Kautz, 1993) is an incomplete 
model-finding algorithm for clausal propositional formulas which performs a randomized 
local search. GSAT has been shown to solve many "hard" problems much more efficiently
than traditional algorithms such as DP (Davis & Putnam, 1960). Since GSAT
applies only to clausal formulas, using it to find models for ordinary propositional formulas 
requires some previous clausal-form conversion. This requires extra computation (which can 
be extremely heavy if the "standard" clausal conversion is used). Much worse, clausal-form 
conversion causes either a large increase in the size of the input formula or an enlargement 
of the search space. 

In this paper we describe how to modify GSAT so that it can be applied to non-clausal 
formulas directly, i.e., with no previous clausal form conversion. An extended version of the 
paper (Sebastiani, 1994) provides the proofs of the theorems and a detailed description of 
the algorithm introduced. 

This achievement could enlarge GSAT's application domain. Selman et al. (1992) suggest
that some traditional AI problems can be formulated as model-finding tasks, e.g., visual
interpretation (Reiter & Mackworth, 1989), planning (Kautz & Selman, 1992), and generation
of "vivid" knowledge representation (Levesque, 1986). Non-clausal representations are
often more compact for such problems. For instance, each rule of the form
"$\bigwedge_i \varphi_i \supset \psi$" gives rise to several distinct clauses if some $\varphi_i$ are
disjunctions or $\psi$ is a conjunction. In automated theorem proving (a.t.p.) some
applications of model-finding have been proposed (see, e.g., Artosi & Governatori, 1994;
Klingerbeck, 1994). For instance, some decision procedures for decidable subclasses of
first-order logic iteratively perform non-clausal model-finding for propositional instances
of the input formulas (Jeroslow, 1988). More generally, some model-guided techniques for
proof search, like goal deletion (Ballantyne & Bledsoe, 1982), false preference, or semantic
resolution (Slaney, 1993), seem to be applicable to non-clausal a.t.p. as well.

procedure GSAT($\phi$)
  for j := 1 to Max-tries do
    T := initial($\phi$)
    for k := 1 to Max-flips do
      if $T \models \phi$
        then return T
        else Poss-flips := hill-climb($\phi$, T)
             V := pick(Poss-flips)
             T := flip(V, T)
             UpdateScores($\phi$, V)
    end
  end
  return "no satisfying assignment found".

Figure 1: A general schema for GSAT.

2. GSAT 

If $\phi$ is a clausal propositional formula and $T$ is a truth assignment for the variables of
$\phi$, then the number of clauses of $\phi$ which are falsified by $T$ is called the score of $T$ for $\phi$
($score(T, \phi)$). $T$ satisfies $\phi$ iff $score(T, \phi) = 0$. The notion of score plays a key role in
GSAT, as it is regarded as the "distance" from a truth assignment to a satisfying one.

The schema of Figure 1 describes GSAT as well as many of its possible variants. We use
the notation from (Gent & Walsh, 1993). GSAT performs an iterative search for a satisfying
truth assignment for $\phi$, starting from a random assignment provided by initial(). At each
step, the successive assignment is obtained by flipping (inverting) the truth value of one
single variable $V$ in $T$. $V$ is chosen to minimize the score. Let $T_i$ be the assignment obtained
from $T$ by flipping its $i$-th variable $V_i$. hill-climb() returns the set Poss-flips of the variables
$V_r$ which minimize $score(T_r, \phi)$. If the current values of $\Delta s_i = score(T_i, \phi) - score(T, \phi)$
are stored for every variable $V_i$, then hill-climb() simply returns the set of the variables $V_r$
with the best $\Delta s_r$. pick() chooses randomly one of such variables. flip() returns $T$ with $V$'s
value flipped. After each flip, UpdateScores() updates the values of $\Delta s_i$, for all $i$.

This paper exploits the observation that the functions initial(), hill-climb(), pick() and
flip() do not depend on the structure of the input formula $\phi$, and that the computation
of the scores is the only step where the input formula $\phi$ is required to be in clausal form.
The idea is thus to find a suitable notion of score for non-clausal formulas, and an efficient
algorithm computing it.
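The schema of Figure 1 can be sketched in Python as follows. This is only an illustrative sketch under assumptions that are not part of the paper: clauses are lists of non-zero integers (a positive integer v stands for the variable v, a negative one for its negation), and the scores of all candidate flips are recomputed from scratch at every step instead of being maintained incrementally by UpdateScores(). The names `score` and `gsat` are hypothetical.

```python
import random


def score(clauses, T):
    """Number of clauses falsified by the assignment T (a dict: variable -> bool)."""
    return sum(
        not any(T[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def gsat(clauses, n_vars, max_tries=10, max_flips=100, rng=None):
    """Follow the schema of Figure 1; return a satisfying assignment or None."""
    rng = rng or random.Random()
    for _ in range(max_tries):
        # initial(): a random truth assignment
        T = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
        for _ in range(max_flips):
            if score(clauses, T) == 0:
                return T
            # hill-climb(): score reached by flipping each variable in turn
            after_flip = {}
            for v in T:
                T[v] = not T[v]
                after_flip[v] = score(clauses, T)
                T[v] = not T[v]
            best = min(after_flip.values())
            poss_flips = [v for v in T if after_flip[v] == best]
            v = rng.choice(poss_flips)  # pick()
            T[v] = not T[v]             # flip()
    return None


# (x1 or x2) and (not x1 or x2) and (x1 or not x2): the only model sets both to true
model = gsat([[1, 2], [-1, 2], [1, -2]], n_vars=2)
print(model)
```

Only `score` looks at the clauses; the rest of the loop never inspects the formula, which is the observation the paper builds on.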

3. An extended notion of score 

Let $cnf(\varphi)$ be the result of converting a propositional formula $\varphi$ into clausal form by the
standard method (i.e., by applying the rules of De Morgan). Then the following definition
extends the notion of score to all propositional formulas.

Definition 3.1 The score of a truth assignment $T$ for a propositional formula $\varphi$ is the
number of the clauses of $cnf(\varphi)$ which are falsified by $T$.

Figure 2: The computation tree of $s(T, \varphi)$.

cnf() represents the "natural" clausal form conversion. $cnf(\varphi)$ has the same number of
propositional variables as $\varphi$ and it is logically equivalent to $\varphi$. The problem with cnf() is
the exponential size growth of $cnf(\varphi)$, that is, $|cnf(\varphi)| = O(2^{|\varphi|})$. Definition 3.1 overcomes
this problem, for it is possible to introduce a linear-time computable function $s(T, \varphi)$
which gives the score of $T$ for a formula $\varphi$. This is done directly, i.e., without converting $\varphi$
into clausal form. We define $s(T, \varphi)$ recursively as follows (see footnote 1):

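$$
\begin{array}{l|l|l}
\varphi & s(T, \varphi) & s^-(T, \varphi) \\
\hline
\varphi \text{ literal} &
  \begin{cases} 0 & \text{if } T \models \varphi \\ 1 & \text{otherwise} \end{cases} &
  \begin{cases} 1 & \text{if } T \models \varphi \\ 0 & \text{otherwise} \end{cases} \\
\neg \varphi_1 & s^-(T, \varphi_1) & s(T, \varphi_1) \\
\bigwedge_k \varphi_k & \sum_k s(T, \varphi_k) & \prod_k s^-(T, \varphi_k) \\
\bigvee_k \varphi_k & \prod_k s(T, \varphi_k) & \sum_k s^-(T, \varphi_k) \\
\varphi_1 \supset \varphi_2 & s^-(T, \varphi_1) \cdot s(T, \varphi_2) & s(T, \varphi_1) + s^-(T, \varphi_2) \\
\varphi_1 \equiv \varphi_2 &
  s^-(T, \varphi_1) \cdot s(T, \varphi_2) + s(T, \varphi_1) \cdot s^-(T, \varphi_2) &
  (s(T, \varphi_1) + s^-(T, \varphi_2)) \cdot (s^-(T, \varphi_1) + s(T, \varphi_2))
\end{array}
$$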


$s^-(T, \varphi_k)$ is $s(T, \neg\varphi_k)$. The distinction between $s(T, \varphi_k)$ and $s^-(T, \varphi_k)$ is due to the polarity
of the current subformula $\varphi_k$. During the computation of $s(T, \varphi)$, a call to the function
$s(T, \varphi_j)$ [$s^-(T, \varphi_j)$] is invoked iff $\varphi_j$ is a positive [negative] subformula of $\varphi$.

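Example 3.1 Figure 2 represents the computation tree of the score of a truth assignment
$T$ for the formula $\varphi$:

$$
\begin{array}{l}
(((\neg A \wedge \neg B \wedge C) \vee \neg D \vee (\neg E \wedge \neg F)) \wedge \neg C \wedge ((\neg D \wedge A \wedge \neg E) \equiv (C \wedge F))) \;\vee \\
(D \wedge \neg E \wedge B) \vee (((D \wedge \neg A) \vee (\neg F \wedge D \wedge \neg B) \vee \neg F) \wedge A \wedge ((E \wedge A \wedge F) \vee \neg B)).
\end{array}
$$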

$T$ assigns "true" to all the variables of $\varphi$. The information in square brackets associated
with any subformula $\varphi_j$ represents $[s(T, \varphi_j), s^-(T, \varphi_j)]$. For instance, if we consider the small
subtree on the left of Figure 2, then the score is computed in the following way:

$$
\begin{array}{ll}
s(T, (\neg A \wedge \neg B \wedge C) \vee \neg D \vee (\neg E \wedge \neg F)) = & ;\; s(T, \bigvee_k \varphi_k) = \prod_k s(T, \varphi_k) \\
s(T, \neg A \wedge \neg B \wedge C) \cdot s(T, \neg D) \cdot s(T, \neg E \wedge \neg F) = & ;\; s(T, \bigwedge_k \varphi_k) = \sum_k s(T, \varphi_k) \\
(s(T, \neg A) + s(T, \neg B) + s(T, C)) \cdot s(T, \neg D) \cdot (s(T, \neg E) + s(T, \neg F)) = & ;\; \text{literals} \\
(1 + 1 + 0) \cdot 1 \cdot (1 + 1) = 4. &
\end{array}
$$

Notice that $cnf(\varphi)$ is 360 clauses long. □

1. Notice that the definition of $s(T, \varphi)$ can be easily extended to formulas involving other connectives (e.g.,
nand, nor, xor, if-then-else, ...) or more complicated boolean functions.



Theorem 3.1 Let $\varphi$ be a propositional formula and $T$ a truth assignment for the variables
of $\varphi$. Then the function $s(T, \varphi)$ gives the score of $T$ for $\varphi$.

The proof follows from the consideration that, for any truth assignment $T$, the set of the
false clauses of $cnf(\varphi_1 \vee \varphi_2)$ is the cross product between the two sets of the false clauses
of $cnf(\varphi_1)$ and $cnf(\varphi_2)$.
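The cross-product argument can be checked by brute force on small formulas. The sketch below is illustrative only and rests on assumptions made here: formulas are nested tuples built from `"var"`, `"not"`, `"and"` and `"or"`, the conversion keeps every clause without simplification (so clauses are counted with multiplicity), and the names `cnf`, `s`, `falsified` and `agrees` are hypothetical.

```python
from itertools import product
from math import prod


def cnf(f, positive=True):
    """Clauses of the unsimplified CNF of f (of not-f when positive is False).

    A clause is a list of (variable, sign) literals.
    """
    op = f[0]
    if op == "var":
        return [[(f[1], positive)]]
    if op == "not":
        return cnf(f[1], not positive)
    parts = [cnf(g, positive) for g in f[1:]]
    if (op == "and") == positive:
        # behaves as a conjunction at this polarity: union of the clause lists
        return [clause for part in parts for clause in part]
    # behaves as a disjunction: one clause per element of the cross product
    return [[lit for clause in combo for lit in clause] for combo in product(*parts)]


def s(T, f, positive=True):
    """Score of T for f (for not-f when positive is False), without building cnf(f)."""
    op = f[0]
    if op == "var":
        return 0 if T[f[1]] == positive else 1
    if op == "not":
        return s(T, f[1], not positive)
    vals = [s(T, g, positive) for g in f[1:]]
    return sum(vals) if (op == "and") == positive else prod(vals)


def falsified(T, clauses):
    """Count the clauses in which every literal is false under T."""
    return sum(all(T[v] != sign for v, sign in clause) for clause in clauses)


def agrees(f, variables):
    """Compare s(T, f) with the falsified clauses of cnf(f) for every assignment T."""
    clauses = cnf(f)
    for values in product([False, True], repeat=len(variables)):
        T = dict(zip(variables, values))
        if s(T, f) != falsified(T, clauses):
            return False
    return True


A, B, C, D, E, F = (("var", x) for x in "ABCDEF")
sub = ("or",
       ("and", ("not", A), ("not", B), C),
       ("not", D),
       ("and", ("not", E), ("not", F)))
print(len(cnf(sub)), agrees(sub, "ABCDEF"))  # 6 True
```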

Theorem 3.2 Let $\varphi$ be a propositional formula and $T$ a truth assignment for the variables
of $\varphi$. Then the number of operations required for calculating $s(T, \varphi)$ grows linearly with the
size of $\varphi$.

The proof follows from the fact that, if $Time(s^{\pm}(\varphi_i, T))$ is the number of operations required
for computing both $s(T, \varphi_i)$ and $s^-(T, \varphi_i)$, and if $Time(s^{\pm}(\varphi_i, T)) \le a_i \cdot |\varphi_i| + b_i$, then
$Time(s^{\pm}(\varphi_1 \circ \varphi_2, T)) \le \max_i(a_i) \cdot |\varphi_1 \circ \varphi_2| + 2 \cdot \max_i(b_i) + 6$, for any $\circ \in \{\wedge, \vee, \supset, \equiv\}$.

The number of operations required for computing the score of an assignment $T$ for
a clausal formula $\phi$ is $O(|\phi|)$. If $\phi = cnf(\varphi)$, then $|\phi| = O(2^{|\varphi|})$. Thus the standard
computation of the score of $T$ for $\phi$ requires $O(2^{|\varphi|})$ operations, while $s(T, \varphi)$ obtains the
same result directly in linear time.
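To make the gap concrete, consider the following family of formulas, chosen here purely for illustration (it does not appear in the original analysis). The formula $\varphi_n$ contains $2n$ literals and its standard clausal form has $2^n$ clauses, yet the score is obtained with $n$ additions and $n - 1$ multiplications:

```latex
\varphi_n = \bigvee_{i=1}^{n} (a_i \wedge b_i), \qquad
|cnf(\varphi_n)| = \prod_{i=1}^{n} |cnf(a_i \wedge b_i)| = 2^n

% For the assignment T that makes every variable false:
s(T, \varphi_n) = \prod_{i=1}^{n} \bigl( s(T, a_i) + s(T, b_i) \bigr)
                = \prod_{i=1}^{n} (1 + 1) = 2^n
```

Under this assignment every clause of $cnf(\varphi_n)$ is falsified, and $s$ counts them without enumerating them.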

4. GSAT for non-clausal formulas 

It follows from Sections 2 and 3 that we can extend GSAT to non-clausal formulas $\varphi$ by simply
using the extended notion of score of Definition 3.1. Let NC-GSAT (non-clausal GSAT)
be a new version of GSAT in which the scores are computed by some implementation of
the function s(). Then it follows from Theorem 3.1 that in NC-GSAT($\varphi$) the function
hill-climb() always returns the same sets of variables as in GSAT($cnf(\varphi)$), so that NC-GSAT($\varphi$)
performs the same flips and returns the same result as GSAT($cnf(\varphi)$). Theorem 3.2 ensures
that every score computation is performed in linear time.
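A minimal sketch of NC-GSAT along these lines is given below. It is not the optimized procedure of (Sebastiani, 1994): the score of every candidate flip is recomputed from scratch with a full pass over the formula, and the nested-tuple formula encoding and the names `scores` and `nc_gsat` are assumptions made here for illustration.

```python
import random
from math import prod


def scores(T, f):
    """Return (s(T, f), s^-(T, f)) for a formula given as nested tuples."""
    op = f[0]
    if op == "var":
        return (0, 1) if T[f[1]] else (1, 0)
    if op == "not":
        s, n = scores(T, f[1])
        return (n, s)
    if op in ("and", "or"):
        pairs = [scores(T, g) for g in f[1:]]
        pos = [p for p, _ in pairs]
        neg = [q for _, q in pairs]
        return (sum(pos), prod(neg)) if op == "and" else (prod(pos), sum(neg))
    if op in ("imp", "iff"):
        s1, n1 = scores(T, f[1])
        s2, n2 = scores(T, f[2])
        if op == "imp":
            return (n1 * s2, s1 + n2)
        return (n1 * s2 + s1 * n2, (s1 + n2) * (n1 + s2))
    raise ValueError(f"unknown connective: {op!r}")


def nc_gsat(f, variables, max_tries=10, max_flips=100, rng=None):
    """GSAT schema with the clausal score replaced by s(T, f)."""
    rng = rng or random.Random()
    for _ in range(max_tries):
        T = {v: rng.random() < 0.5 for v in variables}  # initial()
        for _ in range(max_flips):
            if scores(T, f)[0] == 0:
                return T
            after_flip = {}
            for v in variables:                         # hill-climb()
                T[v] = not T[v]
                after_flip[v] = scores(T, f)[0]
                T[v] = not T[v]
            best = min(after_flip.values())
            v = rng.choice([u for u in variables if after_flip[u] == best])  # pick()
            T[v] = not T[v]                             # flip()
    return None


A, B = ("var", "A"), ("var", "B")
f = ("and", ("iff", A, B), A)  # (A ≡ B) ∧ A, whose only model makes A and B true
print(nc_gsat(f, ["A", "B"]))
```

The search loop is the same as in the clausal case; only the score function changes, and the formula is never converted to clausal form.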

The current implementation of GSAT (Selman & Kautz, 1993) provides a highly-optimized
implementation of UpdateScores($\phi$, V), which analyzes only the clauses in which
the last-flipped variable V occurs. This allows a strong reduction in computational cost.
In (Sebastiani, 1994) we describe in detail an analogous optimized version of the updating
procedure for NC-GSAT, called NC-UpdateScores($\varphi$, V), and prove the following properties:

(i) if $\varphi$ is in clausal form, i.e., $\varphi = cnf(\varphi)$, then NC-UpdateScores($\varphi$, V) has the same
complexity as UpdateScores($\varphi$, V);

(ii) if $\phi = cnf(\varphi)$, then NC-UpdateScores($\varphi$, V) is $O(|\varphi|)$. UpdateScores($\phi$, V) is $O(2^{|\varphi|})$.

The latter mirrors the complexity issues presented in Section 3.

The idea introduced in this paper can be applied to most variants of GSAT. In "CSAT"
(Cautious SAT) hill-climb() returns all the variables which cause a decrease of the score;
in "DSAT" (Deterministic SAT) the function pick() performs a deterministic choice; in
"RSAT" (Random walk SAT) the variable is picked randomly among all the variables; in
"MSAT" (Memory SAT) pick() remembers the last flipped variable and avoids picking it.
All these variants, proposed in (Gent & Walsh, 1992, 1993), can be transposed into
NC-GSAT as well, as they are independent of the structure of the input formula. Selman and
Kautz (1993) suggest some variants which improve the performance and overcome some
problems, such as that of escaping local minima. The strategy "Averaging in" suggests a
different implementation of the function initial(): instead of a random assignment, initial()
returns a bitwise average of the best assignments of the two latest cycles. This is independent
of the form of the input formula. In the strategy "random walk" the sequence hill-climb()
- pick() is replaced, with probability p, by a simpler choice function: "choose randomly a
variable occurring in some unsatisfied clause". This idea can be transposed into NC-GSAT
as well: "choose randomly a branch passing only through nodes whose score is different from
zero, and pick the variable at the leaf".
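The branch selection for the "random walk" strategy might be sketched as follows. This is a hedged illustration, not the paper's procedure: it covers only formulas built from `"var"`, `"not"`, `"and"` and `"or"` (how subformulas of an equivalence, which occur in both polarities, should be treated is not specified in the text), it recomputes scores at every level instead of reading them from values cached at the nodes, and the names `s`, `descend` and `random_walk_pick` are hypothetical.

```python
import random
from math import prod


def s(T, f, positive=True):
    """Score of T for f at the given polarity (s if positive, s^- otherwise)."""
    op = f[0]
    if op == "var":
        return 0 if T[f[1]] == positive else 1
    if op == "not":
        return s(T, f[1], not positive)
    vals = [s(T, g, positive) for g in f[1:]]
    return sum(vals) if (op == "and") == positive else prod(vals)


def descend(T, f, rng, positive=True):
    """Follow a random branch through children whose score is non-zero."""
    op = f[0]
    if op == "var":
        return f[1]
    if op == "not":
        return descend(T, f[1], rng, not positive)
    candidates = [g for g in f[1:] if s(T, g, positive) != 0]
    return descend(T, rng.choice(candidates), rng, positive)


def random_walk_pick(T, f, rng=None):
    """Pick the variable at the leaf of a random branch of non-zero score."""
    if s(T, f) == 0:
        raise ValueError("T already satisfies f: no branch with non-zero score")
    return descend(T, f, rng or random.Random())


A, B, C, D, E, F = (("var", x) for x in "ABCDEF")
sub = ("or",
       ("and", ("not", A), ("not", B), C),
       ("not", D),
       ("and", ("not", E), ("not", F)))
T = dict.fromkeys("ABCDEF", True)
print(random_walk_pick(T, sub))  # one of A, B, D, E, F; never C, whose leaf has score 0
```

When a node has a non-zero score, at least one child has a non-zero score (all of them, if the node multiplies its children's scores), so the descent always reaches a leaf whose literal is false under T.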

One final observation is worth making. In order to overcome the exponential growth of
CNF formulas, some algorithms have been proposed (Plaisted & Greenbaum, 1986; de la
Tour, 1990) which convert propositional formulas $\varphi$ into polynomial-size clausal formulas $\psi$.
Such methods are based on the introduction of new variables, each representing a subformula
of the original input $\varphi$. Unfortunately, size-polynomiality holds only if no "$\equiv$"
occurs in $\varphi$, as the number of clauses of $\psi$ grows exponentially with the number of "$\equiv$" in
$\varphi$. Even worse, the introduction of $k$ new variables enlarges the search space by a $2^k$ factor
and strongly reduces the solution ratio. In fact, any model for $\psi$ is also a model for $\varphi$, but
for any model of $\varphi$ we only know that one of its $2^k$ extensions is a model of $\psi$ (Plaisted &
Greenbaum, 1986).